Folding rate prediction using total contact distance.

Authors

  • Hongyi Zhou
  • Yaoqi Zhou
Abstract

Linear regression analysis has found that both the contact order (CO) and the long-range order (LRO) parameters correlate significantly with the logarithm of the folding rate. This suggests that sequence separation per contact and total number of contacts are both important in determining the rate of folding. Here, the two factors are incorporated into a new parameter, total contact distance (TCD). Using a database of 28 two-state or weakly three-state folding proteins, TCD is found to be the most accurate of the three parameters (CO, LRO, and TCD) in terms of both correlation and prediction. It predicts folding rates even more accurately than the best neural network results with two descriptors (contact order and stability per residue). The improvement is achieved in all three structural classes (all-alpha, all-beta, and mixed). The accuracy of TCD in predicting folding rates is essentially unchanged if "short"-range contacts (|i - j| ≤ 14) are excluded from the calculation. Thus, only long-range contacts with a sequence separation of more than 14 residues are important in determining the rate of folding. This is consistent with the results from the long-range order parameter. One of the significant outliers in prediction is the only protein in the database that involves nonlocal disulfide bonds. Removing this protein leads to a correlation coefficient of 0.89 between experimentally observed and predicted folding rates in jackknife cross validation. The corresponding values for CO and LRO are 0.71 and 0.80, respectively.
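
The abstract does not state the formulas for the three parameters, so the following is only a minimal sketch under stated assumptions. It takes CO as the summed sequence separation divided by (number of contacts × chain length), LRO as the number of long-range contacts divided by chain length, and TCD as the summed sequence separation divided by the square of the chain length. The contact definition (one coordinate per residue, 8 Å distance cutoff), the default sequence-separation cutoffs for CO and LRO, and all function and parameter names are illustrative assumptions rather than the authors' exact procedure.

```python
import math


def find_contacts(coords, dist_cutoff=8.0, l_cut=0):
    """Return residue pairs (i, j), i < j, that are in contact.

    coords      : list of (x, y, z) tuples, one per residue (assumed C-alpha).
    dist_cutoff : assumed contact distance in angstroms.
    l_cut       : only pairs with sequence separation j - i > l_cut are kept.
    """
    n = len(coords)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if j - i > l_cut and math.dist(coords[i], coords[j]) < dist_cutoff
    ]


def contact_order(coords, dist_cutoff=8.0, l_cut=0):
    """Assumed CO: mean sequence separation per contact, scaled by chain length."""
    pairs = find_contacts(coords, dist_cutoff, l_cut)
    if not pairs:
        return 0.0
    return sum(j - i for i, j in pairs) / (len(pairs) * len(coords))


def long_range_order(coords, dist_cutoff=8.0, l_cut=12):
    """Assumed LRO: number of long-range contacts per residue."""
    return len(find_contacts(coords, dist_cutoff, l_cut)) / len(coords)


def total_contact_distance(coords, dist_cutoff=8.0, l_cut=14):
    """Assumed TCD: summed sequence separation of contacts over chain length squared.

    The default l_cut=14 drops the "short"-range contacts (|i - j| <= 14)
    that the abstract reports as unimportant for prediction accuracy.
    """
    pairs = find_contacts(coords, dist_cutoff, l_cut)
    return sum(j - i for i, j in pairs) / len(coords) ** 2
```

Under these assumed definitions, TCD computed with a given cutoff equals the product of CO and LRO computed with the same cutoff, which is one way to read the statement that TCD combines sequence separation per contact with the total number of contacts.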

Related articles

Sequence determinants of protein folding rates: positive correlation between contact energy and contact range indicates selection for fast folding.

In comparison with intense investigation of the structural determinants of protein folding rates, the sequence features favoring fast folding have received little attention. Here, we investigate this subject using simple models of protein folding and a statistical analysis of the Protein Data Bank (PDB). The mean-field model by Plotkin and coworkers predicts that the folding rate is accelerated...

Folding rate prediction using complex network analysis for proteins with two- and three-state folding kinetics

It is a challenging task to investigate the different influences of long-range and short-range interactions on the two-state and three-state folding kinetics of proteins. The networks of the 30 two-state proteins and 15 three-state proteins were constructed by complex networks analysis at three length scales: Protein Contact Networks, Long-range Interaction Networks and Short-range Interaction Networ...

NNcon: improved protein contact map prediction using 2D-recursive neural networks

Protein contact map prediction is useful for protein folding rate prediction, model selection and 3D structure prediction. Here we describe NNcon, a fast and reliable contact map prediction server and software. NNcon was ranked among the most accurate residue contact predictors in the Eighth Critical Assessment of Techniques for Protein Structure Prediction (CASP8), 2008. Both NNcon server and ...

CoinFold: a web server for protein contact prediction and contact-assisted protein folding

CoinFold (http://raptorx2.uchicago.edu/ContactMap/) is a web server for protein contact prediction and contact-assisted de novo structure prediction. CoinFold predicts contacts by integrating joint multi-family evolutionary coupling (EC) analysis and supervised machine learning. This joint EC analysis is unique in that it not only uses residue coevolution information in the target protein famil...

Analysis of rate-limiting long-range contacts in the folding rate of three-state and two-state proteins.

In the past decade, when compared to models describing the folding rates of two-state proteins, models describing the folding mechanism of three-state proteins remain quite limited due to the complexity present in the folding mechanism and lack in their experimental data. In the present work, rate-limiting long-range contacts were classified into various bins based on sequence separation distan...

Journal:
  • Biophysical journal

Volume 82, Issue 1 Pt 1

Publication date: 2002